EN FR
EN FR


Section: New Software and Platforms

GEP-PG

Goal Exploration Process - Policy Gradient

Keywords: Machine learning - Deep learning

Functional Description: Reinforcement Learning algorithm working with OpenAI Gym environments. A first phase implements exploration using a Goal Exploration Process (GEP). Samples collected during exploration are then transferred to the memory of a deep reinforcement learning algorithm (deep deterministic policy gradient or DDPG). DDPG then starts learning from a pre-initialized memory so as to maximize the sum of discounted rewards given by the environment.

  • Contact: Cedric Colas